07. Filter, Drop Nulls, Dedupe

1. Filter

For consistency, compare only cars certified to California standards. Filter both datasets using query to select only the rows where cert_region is CA. Then drop the cert_region column from both datasets, since it no longer provides any useful information (every remaining value is 'CA').
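The filtering step might look like the following minimal sketch. It assumes pandas; the DataFrame names df_08 and df_18, the model column, and the toy rows are hypothetical stand-ins for the two datasets. Only cert_region and the value CA come from the instructions above.

```python
import pandas as pd

# Hypothetical toy stand-ins for the two datasets (names and values are assumptions).
df_08 = pd.DataFrame({"model": ["A", "B", "C"], "cert_region": ["CA", "FA", "CA"]})
df_18 = pd.DataFrame({"model": ["D", "E"], "cert_region": ["FA", "CA"]})

# Keep only the rows certified by California standards.
df_08 = df_08.query('cert_region == "CA"')
df_18 = df_18.query('cert_region == "CA"')

# Every remaining value is 'CA', so the column no longer carries information.
df_08 = df_08.drop(columns="cert_region")
df_18 = df_18.drop(columns="cert_region")

print(df_08.shape, df_18.shape)
```

Checking shape after each step is a quick way to confirm how many rows and columns remain.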

2. Drop Nulls

Drop any rows in both datasets that contain missing values.
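As a rough sketch of this step, assuming pandas and a hypothetical toy DataFrame named df_08 (the same call would be repeated for the second dataset), dropna removes every row that has at least one missing value:

```python
import pandas as pd

# Hypothetical toy data with missing values in two different rows.
df_08 = pd.DataFrame({"model": ["A", "B", None], "cyl": [4.0, None, 6.0]})

# Count missing values across all columns before dropping.
missing_before = int(df_08.isnull().sum().sum())

# Drop any row that contains at least one missing value.
df_08 = df_08.dropna()

# Confirm that no missing values remain.
missing_after = int(df_08.isnull().sum().sum())
print(missing_before, missing_after, df_08.shape)
```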

3. Dedupe

Drop any duplicate rows in both datasets.
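A minimal sketch of deduplication, again assuming pandas and a hypothetical toy DataFrame named df_18: duplicated marks rows that repeat an earlier row exactly, and drop_duplicates removes them while keeping the first occurrence.

```python
import pandas as pd

# Hypothetical toy data in which the second row repeats the first.
df_18 = pd.DataFrame({"model": ["A", "A", "B"], "cyl": [4, 4, 6]})

# Count rows that exactly duplicate an earlier row.
n_dupes = int(df_18.duplicated().sum())

# Drop the duplicates, keeping the first occurrence of each row.
df_18 = df_18.drop_duplicates()

print(n_dupes, len(df_18))
```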

QUIZ QUESTION:

Match the values for the following features about the new dataset after filtering by certification region.

ANSWER CHOICES:

Values: 14, 798, 13, 823, 10, 2404, 1611, 1084

SOLUTION:

Values: 798, 13, 1084